Large-scale Imputation for Complex Surveys

Authors

  • David A. Marker
  • David R. Judkins
  • Marianne Winglee
Abstract

Much of the recent research into imputation methodology has focused on developing optimal procedures for a single variable or set of variables, where the patterns of missingness and the underlying distributions follow standard forms. In contrast, it is frequently necessary to impute for many variables from a single survey, with an even larger set of potential covariates and complex covariance structures among the variables to be imputed. Further, the imputations need to be completed in a relatively short time frame and within a constrained budget. The analyst is also unlikely to be able to anticipate all of the important analyses for which the imputed data will be used. These constraints often prevent analysts from producing optimal imputations for each variable. Instead, one tries to produce a set of imputed variables that minimizes the attenuation of key relationships, ideally reduces nonresponse bias, and satisfies the time and budgetary constraints.

Similar resources

Multiple Imputation in a Complex Sample Survey

Multiple imputation for missing survey data is a relatively new concept. As defined by one of its leading proponents, "multiple imputation is the technique that replaces each missing or deficient value with two or more acceptable values representing a distribution of possibilities" (Rubin 1987, p.2). Multiply-imputed data reflects the uncertainty contained in the imputation process in a way not p...

Combining synthetic data with subsampling to create public use microdata files for large scale surveys

To create public use files from large scale surveys, statistical agencies sometimes release random subsamples of the original records. Random subsampling reduces file sizes for secondary data analysts and reduces risks of unintended disclosures of survey participants’ confidential information. However, subsampling does not eliminate risks, so that alteration of the data is needed before dissemi...

Nonparametric Bayesian Multiple Imputation for Incomplete Categorical Variables in Large-Scale Assessment Surveys

In many surveys, the data comprise a large number of categorical variables that suffer from item nonresponse. Standard methods for multiple imputation, like log-linear models or sequential regression imputation, can fail to capture complex dependencies and can be difficult to implement effectively in high dimensions. We present a fully Bayesian, joint modeling approach to multiple imputation fo...

Multiple imputation in a large-scale complex survey: a practical guide.

The Cancer Care Outcomes Research and Surveillance (CanCORS) Consortium is a multisite, multimode, multiwave study of the quality and patterns of care delivered to population-based cohorts of newly diagnosed patients with lung and colorectal cancer. As is typical in observational studies, missing data are a serious concern for CanCORS, following complicated patterns that impose severe challenge...

Analysis of Large-Scale Social Surveys

Large-scale social surveys are an important source of information for a wide range of topics. In analyzing such surveys, it is important to be aware of the complexity of the sampling design and the data adjustments that are used by survey organizations, including weighting to adjust for differences between sample and population and imputation to fill in missing responses. For estimating population...

Bayesian Multiple Imputation for Large-Scale Categorical Data with Structural Zeros

We propose an approach for multiple imputation of items missing at random in large-scale surveys with exclusively categorical variables that have structural zeros. Our approach is to use mixtures of multinomial distributions as imputation engines, accounting for structural zeros by conceiving of the observed data as a truncated sample from a hypothetical population without structural zeros. Thi...

Publication date: 1999